Non-invasive estimation of the powder size distribution from a single speckle image

Non-invasive characterization of powders may take one of two approaches: imaging and counting individual particles; or relying on scattered light to estimate the particle size distribution (PSD) of the ensemble. The former approach runs into practical difficulties, as the system must conform to the working distance and other restrictions of the imaging optics. The latter approach requires an inverse map from the speckle autocorrelation to the particle sizes. The principle relies on the pupil function determining the basic sidelobe shape, whereas the particle size spread modulates the sidelobe intensity. We recently showed that it is feasible to invert the speckle autocorrelation and obtain the PSD using a neural network, trained efficiently through a physics-informed semi-generative approach. In this work, we eliminate one of the most time-consuming steps of our previous method by engineering the pupil function. By judiciously blocking portions of the pupil, we sacrifice some photons but in return we achieve much enhanced sidelobes and, hence, higher sensitivity to the change of the size distribution. The result is a 60 × reduction in total acquisition and processing time, or 0.25 seconds per frame in our implementation. Almost real-time operation in our system is not only more appealing toward rapid industrial adoption, it also paves the way for quantitative characterization of complex spatial or temporal dynamics in drying, blending, and other chemical and pharmaceutical manufacturing processes.


Forward model and background fluctuation analysis
Forward Model Theory.
The forward problem theory establishes a relationship between a specific particle size distribution (PSD) and the raw speckle pattern.It treats the PSD as a probability density function, which determines the ensemble autocorrelation function of the raw speckle.A simplified analytical model is shown in Fig. S1.Without loss of generality, this model assumes that only particles can scatter light and considers the particles as a thin optical mask () with unity reflectivity.
According to the reference 1 , we can safely assume the () = ().Thus, the equation ( 11) is simplified as, Since () is the reflection envelope of the particles, we further assume () is a binary function, () becomes the power spectral density of ().The following sections will discuss different ways to model ().

Random impulse process
For the simplest case, we neglect the particle size and shape.() is a set of points at positions  1 ,  2 , …,   , which are uniformly distributed in the illumination region with a total number of K.
We define the probability distribution as the normalized pupil function () = () ∫ () , () is the pupil function.
We are interested in the ensemble average of ().
We need to consider two different sets of terms,  =  and  ≠ .In the double summation, there exist K separate terms for which  = , each one contributes unity.These terms contribute in 〈()〉.
In addition, there are  2 −  terms having  ≠ .  and   are independent to each other, there () is the Fourier transform of the normalized pupil function ().Considering these two sets of terms, The ensemble averaged autocorrelation is proportional to the power spectral density of the pupil function, with an offset which is inverse proportional to the total number of particles K in the illumination region.
Linear filtered random impulse process Considering the finite particle size and shape, we assume the particle at   has the envelop ℎ.
is the Fourier Transform of particle envelop ℎ() and (0 is the normalization factor to make sure the total region covered by the particles is fixed to unity.Following the same procedure of equation (S15)-(S18), If ℎ follows a random distribution, (S19) and (S20) should be reformulated into, Here we ignore the normalization factor for ℎ() for simplicity, we will add it back after we get the conclusion.For most cases, the random distribution of ℎ  () is independent to .
Based on the discussion related to (S16), we need to group terms in (S24) into two sets.There exist The final expression of 〈()〉 is, is the normalization factor.  (0) is the standard deviation when  = 0.If we ignore the last term in (S25), Specifically for our powder, in 1D case, we assume that ℎ  ( −   ) = ( ), () =  ().  is the particle size for -th particle. follows the distribution (), which is what we want to know.
In 2D case, This expression is the same as equation ( 1) in the main text.

Sensitivity Analysis
Let us discretize the particle size distribution as Here, Δ is the sampling sensitivity of the particle size distribution,  is the measurement complexity (i.e., the number of parameters that we choose to represent the problem) and ( − 1)Δ ≡   , the range of sizes for the measurement.We shall refer to the   ′s as the impulse-like parametric description of the particle sizes.It follows that Here, of course, With a slight abuse of notation, let us now define, and allow Δ → 0,  → 0, while   remains fixed.Then we obtain the equation.( 2) in the main text.

Sample preparation and data collection
To calibrate the speckle, we used commercially available potassium chloride (KCl) powder, which has block-shaped crystals as its morphology.We prepared 18 samples of KCl with different size distributions by sieving bulk KCl using a sieve shaker equipped with sieves of opening sizes (500 µm, 425 µm, 355 µm, 300 µm, 250 µm, 180 µm, and 106 µm).We stacked the sieves in the sieve shaker in order of decreasing sieve opening size from top to bottom.We carried out the sieving process for 15-30 minutes until the weight of each sieve with powder remained constant.We added each sieved, dry powder sample individually to a filter-dryer and stirred at 4 rpm to record 500 frames of the speckle pattern.This process generated 18 calibration data sets corresponding to the sieved KCl samples.To evaluate the generalization ability of our neural network model, we choose 9 sample sets to fine-tune the first stage of the neural network, whereas the remaining 9 sets are the test dataset which is disjoint from the whole training process.Both datasets are kept away from the training process for the rest of the neural network.
Offline particle size analysis with Malvern Mastersizer 2000 attached to a Scirocco 2000 dry dispersion unit was used to obtain PSD data as the ground truth for each KCl sample set.Fig. S2 shows the PSD data obtained from Mastersizer corresponding to each sieved sample.The finite sieves and master-sizer measurement limit the number of datasets we can collect, as they are timeconsuming and unrecyclable.

Neural network structure and hyperparameters
The neural network structure is shown in

Apparatus and drying experiments
Our apparatus including the optics and the filter dryer is nearly the same as the reference 1 , except adding a 3D printed intensity mask on the beam path to shape the pupil.
Optics bench.The wet solid is sealed in the dryer and the laser beam is delivered through a glass window.The angle of incidence on the surface is chosen to be approximately 10 degrees, so as to avoid specular back-reflection from the window onto the camera.The optical beam path is shown in Fig. 4(a).The 532 nm laser beam is expanded to 4.8 mm with a beam expander in the telescope configuration, consisting of lenses L1 (focal length 25 mm) and L2 (30 mm).An intensity mask is plugged to shape the beam after L2, with a 10 mm gap.The beam is initially polarized in the direction parallel to the plane shown in the diagram, and so it is reflected by the polarizing beam splitter (PBS) to reach the sample inside the dryer through the glass window.Passage twice through the quarter-wave plate rotates the outgoing polarization direction from parallel to perpendicular so that the scattered light from the powder sample is now transmitted through the PBS and propagates vertically upwards.The wave plate is tilted by approximately 10 degrees for the same reason as the beam, to minimize specular back-reflection.Lens L3 (250 mm) concentrates the scattered light so that an angular range that is as large as possible can be captured by the CCD (model ZWO ASI183MM Pro).
The laser model is Excelsior 532 Single Mode with 300 mW output power.After the fiber coupling, only 230 mW power injects into our system.Our lenses are made of N-BK7 with anti-reflection coating, whose transmission is 99.5% at 532 nm.The beam has a polarization ratio higher than 100:1.The extinction ratio of the polarized beam splitter (PBS) is higher than 1000:1.The transmission of the quarter plate and the silica window are 99.8% and 90%, respectively.The beam power on the sample surface is 203 mW.The power of the scattered light measured at the CCD plane is 120 μW.The coherence length of the laser is 25 m, which corresponds to a temporal bandwidth of 0.01 pm.This results in sufficient temporal coherence to produce sharp speckles.

Image acquisition.
The monochromatic CCD model is ZWO ASI183MM Pro with 5496 × 3672 pixels with a 2.4 μm pixel size, we run it in the bin-pixel mode with 2744 × 1836 pixels and 45 fps framerate.200 μs exposure time is small enough to maintain a high degree of spatial correlation 1 .The typical size of the speckle pattern calculated from / is 28 µm for our imaging system, which is larger than the pixel size to ensure a good resolution of the speckle pattern.
Filter dryer.The filter drying device shown in Fig. S4(b) was designed based on a prototype 5 .The device has 1400 ml capacity with overall device dimensions of 330 mm height × 220 mm width × 220 mm length without the optical components.The body, impeller, and filter mesh are made of steel and the lid is made of aluminum.The lid contains a vacuum port, an air/gas input port, a camera port, two wash solvent ports, a feed suspension port, and a glass observation window.A vacuum gauge and pressure release valve are attached to the lid.The impeller is connected to a motor on the top of the lid.The impeller blade is designed with teeth to promote cake depumping.
An inductive heater and a thermocouple are attached to the bottom of the device to control the drying temperature.This inductive heating plate is used to heat the steel base and the steel body of the dryer.An endoscope camera is attached to the lid to record process videos of the drying material from the top.The optics bench sheds light on the sample and collects the scattered beam through the observation window.

Generality test
The neural network is trained by KCl powder, in principle, dataset collection and network training have to be performed for each new material.However, if the new material has a similar geometry, refractive index, absorption spectrum to KCl, such as sodium chloride (NaCl) powder and, hence, similar statistics for the scattered field, then the neural network trained by KCl should still able to estimate the particle size distribution.We performed a generality test with NaCl powder, maintaining the same sample preparation and measurement procedure.Test results are shown in  We also performed a stress test with bimodal distribution, which completely fails with the clear pupil case discussed in the previous work 1 .There are two reasons: (1) according to equation (1), the bigger particles contribute more to the size modulation term than the smaller ones, so it is harder to catch the smaller peak in the bimodal distribution; (2) Each || provides an effective equation to solve the inverse problem, but the power spectral density of the clear pupil is rotationally symmetric, there are only few individual peaks having enough contrast in the radical coordinate.Therefore, it is too ill-posed to get a bimodal distribution from only few equations.
The pupil engineering method helps to release the point (2) because our asymmetric design.It can offer a continuous high-contrast pupil power spectral density in the radical coordinate, resulting in more effective equations to solve the PSD.We trained the neural network with a mixture of simulated single and bimodal autocorrelation-PSD pairs.The test results are plotted in Fig. S6.
Although some examples still confuse the model, it is much better than the result in PEACE 1 where the model never figure out the bimodal distributions.For the failure examples, the model can only capture the big peak, which is consistent with the comment (1) in the last paragraph.

The estimation and the ground truth of the PSD monitoring
In Fig. S7, we compared the PSD estimations at the beginning and ending time with the Mastersizer measurements as the ground truth.They match well and both methods validated that the possible crystal breakage 6 did not happen in this experiment.We did not compare the estimated PSD with the Mastersizer during the drying process because Mastersizer does not support in-situ measurement for wet powder.We need either wait until it fully dries or add solvent to make a slurry for the liquid-mode detection.It is one of the advantages of our non-invasive measurement compared to the traditional Mastersizer.

Heuristic pupil design strategy
We optimized the pupil a heuristic way.Fig. S8(a)(i) shows the round-shape pupil with a 4.8 mm diameter.Its power spectral density in (b)(i) is rotationally symmetric and the sidelobe energy is evenly distributed along all directions.Since the size modulation term only depends on the radius coordinate, we can squeeze the energy along the x and y directions to get a stronger sidelobe intensity by a square pupil as shown in (ii).Further introducing a quasi-periodical feature within the illumination region can induce a stronger sidelobe at the corresponding frequency in the power spectral density as shown in (iii).Here we quantitatively optimize the feature period  and the duty Its Fourier transform is, Since  is larger than /2, the position of the first sidelobe is determined by  1 = 2/.The normalized height of the first sidelobe is, Since  < ,  1 monotonously decays with .However, we cannot enhance  1 by reducing the duty cycle to zero, since we need enough photon energy to activate the CMOS detector, also we would like to cover more area on the sample surface, which is proportional to , in 1D case.
Therefore, we maximize a hand-crafted function , the product of  and  1 to balance the sidelobe height and the illumination area.
In Fig. 2a(i)-(iii), the first order sidelobe merges into the main lobe due to the motion blur.To avoid merging into the main lobe, the sidelobe position of the engineered pupil is aligned with the middle point of the first and the second sidelobe in the clear pupil with a diameter  = 4.8 mm.The position of the first sidelobe for () is  1 = 5.3, the second is  2 = 8.5, so their middle point is   = 6.9 =  1 /2.Thus, the optimized period  = 2/ 1 = /(  ) = 2.18 mm, and the bar width  = /2 = 1.09 mm, as shown in Fig. S8(a)(iii).
Moreover, we deliberately make the pupil asymmetric to have different spatial frequencies along the x and y directions as shown in (a)(iv), so that the dip along the x direction is compensated by the peak along the y direction when we only consider the radial coordinate, it is demonstrated in the power spectral spectrum (b)(iv) and the cross-sections (c)(iv).
We print the designed mask with a commercial 3D printer, and inevitably, this cheap and quick method results in a low quality pattern in the final product shown in (d)(i) and (d)(ii), the pattern at the sample surface is further deformed due to the aberration and diffraction, as presented in (d)(iii).However, this mismatch does not influence the performance much in terms of the sidelobe intensity, as shown in (b)(v) and (c)(v).Quantitively speaking, the strongest sidelobe height in the power spectral density decreases from 0.4 in (c)(vi) to 0.3 in (c)(v), still far larger than 0.01 in (c)(i).
To test the robustness of the designed pupil, we add the Gaussian noise to the pupil pattern with different noise levels .Pupil's high-frequency feature has little influence on the low-frequency region of its power spectral density, as shown in Fig. S9, cross-sections of the power spectral density are indistinguishable even for  = 0.5.The pupil performance is quite robust to the fabrication quality.In Fig. 3c, although the overall trend of the estimated PSD matches the ground truth (GT), there are noticeable discrepancies for certain particle sizes, such as some particle sizes in the third row from the left part and the second row from the right part in Fig. 3c.The anomaly discrepancies arise from the differentiation of the zig-zag shape artifacts in predicted cumulative distributions at certain particle sizes, originated from the neural network.In practice, there are two methods to sort out the influence of this anomaly effect.The first is to distinguish the artifact from the real signal by tracking the temporal evolution of the measured size distribution, for example, in Fig. 5a and Fig. 5c, there is an artifact at around 600 m and it persists along the whole drying process.

Savitzky-Golay filter for cumulative distributions
Moreover, the evolution of particle size distribution should follow the physical model, e.g.

Fig. S1 | 3 , 2 ,
Fig.S1 | A powder model sketch.The sketch only considers the top surface, which is made up of 1D powder particles, represented by red rectangles.Each particle has a radius   and a position   .The rough top edge indicates that the scattered beam has a random phase at each position on the surface.
Fig. S2 | PSD calibration curves measured from the Mastersizer serve as the ground truth for each data set.

Fig. S3 |
Fig. S3 | Neural Network Structure.The model takes the autocorrelation image as input and the cumulative distributions as output.It consists of four stages with four convolutional layers in each.IN and ReLU activation are applied before each convolutional layer.There is a flattening layer and a linear layer with Sigmoid activation connected to the output of the fourth stage.The output data dimension and the filter parameters for each layer are labeled at the bottom of this figure.We pretrain the model with 9000 synthetic autocorrelation-PSD pairs and then fine-tune the first stage (marked in the red dashed line) with the experimental data.The whole model contains 625.5k parameters while the first stage only has 4.2k parameters.

Fig. S4 |
Fig. S4 | (a) The photo of our drying filter.The crucial parts are labeled.(b) A sketch of the optics bench.A 3D-printed optical intensity mask is placed on the beam path between Lens L2 and the polarized beam splitter (PBS).Lenses L1, L2, and L3 have focal lengths of 25 mm, 30 mm, and 250 mm, respectively.The distance between the intensity mask and the lens L2 is 10 mm.The inset is the photo of the optics.(c) An example of the powder sample.

Fig. S5 ,
Fig.S5, estimated size distributions could fit the ground truth among different test sets, with a slight mismatch for small particles.

Fig. S5 |
Fig. S5 | Generality test results with NaCl powder.Test results for NaCl power with the model trained by KCl powder.Since two powders have similar physical and chemical properties, the model shows a good generality for NaCl test powder.

Fig. S6 |
Fig. S6 | Simulation test results including bimodal distributions.The model can fit some of the single peak or bimodal distributions, although there are still some failures.For the failure cases, it always catches the bigger peak and misses the smaller one.

Fig. S7 |
Fig. S7 | Results from the Mastersizer and the speckle Probe.The particle size didn't change after the KCl + EtOH/Water drying process confirmed by both methods and has a peak around 300 μm.

Fig. S8 |
Fig. S8 | Heuristic strategy of the pupil design.The pupil shapes (a), power spectral densities (b), and corresponding cross-sections (c) at each design step (i)-(v).(d) (i) The photo of the 3D printed mask.A lamp is set behind to enhance the visual quality of the pattern.(ii) A zoomin image of (i).The red circle labels the clear beam shape.Due to the limited printing quality, the fabricated pattern is slightly larger than the clear beam, the region out of the red circle is not illuminated in the experiment.(iii) The illumination pattern at the sample surface.The deformation arises from the aberration in the optical system and the diffraction in the free space.

Fig. S9 |
Fig. S9 | Noise test of the designed pupil.(a) The pupil shape with different Gaussian noise level, (i)  = 0.01, (ii)  = 0.05, (iii)  = 0.1, (iv)  = 0.5, the original pupil is normalized to 0-1.(b) The corresponding power spectral densities.(c) Cross-sections of the pupil power spectral density along the horizontal axis.Curves overlap with each other and become indistinguishable.
population balance equations.Therefore, is possible to utilize a dynamical physical model to regularize the PSD predictions, which is out of the scope of this work.The second method is to apply a hand-crafted filter to smooth the output cumulative distribution curve.Here we show results from applying the Savitzky-Golay filter to the cumulative distribution with different window widths in Fig.S10.The discrepancies are almost eliminated with a window width larger

Fig. S10 |
Fig. S10 | Smoothed particle size distributions by the Savitzky-Golay filtering.The output cumulative distributions are smoothed by the Savitzky-Golay filter with different window width, then they are differentiated to get particle size distributions.